1 Getting Started

This document provides an example of how we can use geoprocessing techniques to analyse population datasets. The aims of this exercise include:

  1. Demonstrate how we can load datasets within R
  2. Provide some quick visualisation examples to gain an understanding of the dataset
  3. Demonstrate commonly used geospatial techniques for analysing population datasets.

These methods are designed to provide an introduction to the second workshop exercise, which will build on these techniques and demonstrate how we could do a larger analysis around this work.

1.1 Setup

There are two main packages used for spatial data analysis:

  1. sp: this is the more established R spatial data analysis package.
  2. sf: this is the more recent implementation, and generally has easier to use syntax and clearer functions, along with performance improvements.

Although sf is generally seen as the more modern way of doing things, it is still partially in development and therefore there can be some operations which should be done is sp, particularly when dealing with raster datasets.

# Load core spatial packages
library(sp)
library(raster)
library(rgdal)
library(sf)
library(stars)

library(tmap)      # Easy way of producing maps

1.2 Load data

5 datasets are loaded into the analysis, representing typical spatial datasets. These datasets have been accessed through the GRID3 data portal.

  • Population Data: A raster with population data at 100m resolution for Nigeria
  • Built Up Areas: Place or area with clustered or scattered buildings and a permanent human population (city, settlement, town, village) and by definition has no legal boundaries
  • Dump Locations: This provides the location of waste facilities in the Lagos State
  • Roads: Primary roads covering the whole of Nigeria

First we will visualise the data. We have some boundary data for the states:

tmap_mode("view") # See interactive version of map 

tm_shape(boundaries) +
  tm_polygons(alpha = 0, border.col = "dodgerblue") +
  tm_shape(builtAreas) +
  tm_polygons() +
  tm_shape(dumps) +
  tm_dots(size = 0.01)

2 Geoprocessing Techniques

With our data, we will demonstrate three techniques:

  1. Clip:
  2. Buffer:
  3. Zonal Summary:

2.1 Clip vector data

We have road data for the whole of Nigeria which we can use to show clipping. We will use the st_intersection function for this:

roadsLagos <- sf::st_intersection(boundaries, roads)

The results of this clipping are shown below:

tmap_mode("plot")

# Plot Data
map_roads_nigeria <- 
  tm_shape(roads) +
  tm_lines() +
  tm_layout(title = "Roads for Nigeria")

map_boundary_lagos <- 
  tm_shape(roads) +
  tm_lines() +
  tm_shape(boundaries) +
  tm_polygons(alpha = 0.5, border.col = "dodgerblue") +
  tm_layout(title = "Clipping Boundary")

map_roads_lagos <- 
  tm_shape(boundaries) +
  tm_polygons(alpha = 0.5, border.col = "dodgerblue") +
  tm_shape(roadsLagos) +
  tm_lines() +
  tm_layout(title = "Clipped Dataset")

tmap_arrange(map_roads_nigeria, map_boundary_lagos, map_roads_lagos, ncol = 3)

2.2 Buffer data

We can apply buffers around objects, and can be applied to points, lines and polygons. This is useful to know which areas fall within a certain distance from an attribute. Note, we must convert the geographic coordinate system into a planar coordinate system to enable us to calculate distance. The topic of coordinate systems is not covered fully in this training. We calculate a buffer below using the st_buffer function, calculating a buffer of 5000 metres:

# Convert the geographic coordinate system to planar so that we can do distance based calculations
roadsLagos_planar <- st_transform(roadsLagos, crs = 26393)

# Calculate buffer
roads_buffer <- sf::st_buffer(roadsLagos_planar, 5000)

# Convert back to geographic coordinate system
roads_buffer <- st_transform(roads_buffer, crs = 4326)

The results are displayed below. A buffer has been created for each line of the shapefile, and therefore we now have multiple polygons, some of which are overlapping. Depending on your application, you may only need to have a single proximity buffer. We will show below how we can remove these overlapping datasets.

tmap_mode("view") # Create a static map 

# Plot Data
map_roads <-
  tm_shape(roadsLagos) +
  tm_lines() +
  tm_layout(title = "Road")

map_buffer <-
  tm_shape(roads_buffer) +
  tm_polygons() +
  tm_shape(roadsLagos) +
  tm_lines() +
  tm_layout(title = "Buffer")

# Plot results
tmap_arrange(map_roads, map_buffer, ncol = 2, sync = TRUE)

2.3 Union

If we only want a single polygon, we can use the st_union function to merge the shapefiles into a single object. This will remove any overlapping areas.

roads_buffer_dissolved <- st_union(roads_buffer)

We display these below:

# Plot Data
map_buffer_dissolved <-
  tm_shape(roads_buffer_dissolved) +
  tm_polygons() +
  tm_shape(roadsLagos, ) +
  tm_lines() +
  tm_layout(title = "Dissolved Polygons")

# Plot results
tmap_arrange(map_buffer, map_buffer_dissolved, ncol = 2, sync = TRUE)

2.4 Clip Raster Data

We can combine our spatial datasets with the GRID3 population data. Unfortunately rasters do not play well with sf yet. We must therefore convert the data to an sp object. R requires two steps:

  1. raster::crop()
  2. raster::mask()

We will use these both to crop the population raster to Lagos state:

roads_buffer_sp <- as(roads_buffer_dissolved, 'Spatial')
population_road_buffer <- raster::crop(population, roads_buffer_sp) %>%
  raster::mask(roads_buffer_sp)
tm_shape(roads_buffer_dissolved) +
  tm_polygons(alpha = 0) +
tm_shape(population_road_buffer) +
  tm_raster() +
  tm_layout(title = "Population within 5km of major road")

2.5 Summary Statistics

cellStats(x = population_road_buffer, sum)
## [1] 9090986
 

2016, GRID3

Project licence: GNU Affero General Public License v3.0